Indexing Structures for Approximate String Matching

Authors

  • Alessandra Gabriele
  • Filippo Mignosi
  • Antonio Restivo
  • Marinella Sciortino
Abstract

In this paper we give the first, to our knowledge, structures and corresponding algorithms for approximate indexing, by considering the Hamming distance, having the following properties. i) Their size is linear times a polylog of the size of the text on average. ii) For each pattern x, the time spent by our algorithms for finding the list occ(x) of all occurrences of a pattern x in the text, up to a certain distance, is proportional on average to |x| + |occ(x)|, under an additional but realistic hypothesis.
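To make the query in the abstract concrete, here is a minimal sketch of what occ(x) contains under the Hamming distance. It is a brute-force scan, not the indexing structure of the paper, and it takes time proportional to the text length times |x| rather than |x| + |occ(x)|. The function name `hamming_occurrences` and the parameter `k` (the maximum number of mismatches) are assumptions chosen for illustration.

```python
def hamming_occurrences(text: str, pattern: str, k: int) -> list[int]:
    """Return every start position where `pattern` matches `text`
    with at most `k` mismatching characters (Hamming distance <= k).

    Brute-force baseline for illustration only; it is NOT the index
    described in the abstract.
    """
    m = len(pattern)
    occ = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the allowed distance
        if mismatches <= k:
            occ.append(i)
    return occ


if __name__ == "__main__":
    print(hamming_occurrences("banana", "nana", 1))
```

An approximate index aims to return the same list without scanning every window of the text.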

Similar resources

Indexing Methods for Approximate String Matching

Indexing for approximate text searching is a novel problem receiving much attention because of its applications in signal processing, computational biology and text retrieval, to name a few. We classify most indexing methods in a taxonomy that helps understand their essential features. We show that the existing methods, rather than completely different as they are regarded, form a range of solut...

A Hybrid Indexing Method for Approximate String Matching

We present a new indexing method for the approximate string matching problem. The method is based on a suffix array combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the average retrieval time is O(n^λ log n), for some λ > 0 that depends on the error fraction tolerated α and the alphabet size σ. It is shown that λ < 1 for approximately α < 1 − e/√σ, where e = 2.718.... The space required is four times...
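The idea of combining a suffix array with a partitioning of the pattern can be illustrated with a deliberately simplified sketch. It is not the hybrid method of this abstract: it swaps in the classic pigeonhole filter (split the pattern into k + 1 pieces, so any match with at most k mismatches contains one piece exactly) and verifies candidates with the Hamming distance instead of the edit distance. All function names are hypothetical, and the suffix array is built naively for readability.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive suffix array: suffix start positions in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def exact_positions(text: str, sa: list[int], piece: str) -> list[int]:
    """Binary-search the suffix array for all exact occurrences of `piece`."""
    n = len(piece)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose length-n prefix is >= piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] < piece:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose length-n prefix is > piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] <= piece:
            lo = mid + 1
        else:
            hi = mid
    return [sa[i] for i in range(start, lo)]


def search_with_mismatches(text: str, sa: list[int], pattern: str, k: int) -> list[int]:
    """Pigeonhole filter: look up k + 1 pattern pieces exactly, then verify."""
    m, pieces = len(pattern), k + 1
    if m < pieces:
        raise ValueError("pattern must have at least k + 1 characters")
    bounds = [m * j // pieces for j in range(pieces + 1)]
    candidates = set()
    for j in range(pieces):
        off, end = bounds[j], bounds[j + 1]
        for pos in exact_positions(text, sa, pattern[off:end]):
            start = pos - off  # where the whole pattern would begin
            if 0 <= start <= len(text) - m:
                candidates.add(start)
    # Verification step: keep candidates within Hamming distance k.
    return sorted(
        s for s in candidates
        if sum(a != b for a, b in zip(text[s:s + m], pattern)) <= k
    )


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(search_with_mismatches(text, sa, "nana", 1))
```

The filter only pays off when the pieces are long enough to be selective; as k grows relative to the pattern length, the pieces shrink and more candidates must be verified.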

A New Indexing Method for Approximate String Matching

We present a new indexing method for the approximate string matching problem. The method is based on a suffix tree combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the retrieval time is O(n^λ), for 0 < λ < 1, whenever α < 1 − e/√σ, where α is the error level tolerated and σ is the alphabet size. We experimentally show that this index outperforms by far all othe...

Approximate String Matching

We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the R occurrences of ...

Finding Approximate Matches in Large Lexicons

Approximate string matching is used for spelling correction and personal name matching. In this paper we show how to use string matching techniques in conjunction with lexicon indexes to find approximate matches in a large lexicon. We test several lexicon indexing techniques, including n-grams and permuted lexicons, and several string matching techniques, including string similarity measures an...
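As an illustration of pairing a lexicon index with a string matching technique, the sketch below uses an n-gram inverted index to collect candidate words and the edit distance to filter them. It is only one possible combination, not the configuration evaluated in that paper; the function names, the `#` padding character, and the default values `n=2` and `max_dist=1` are assumptions.

```python
from collections import defaultdict


def ngrams(word: str, n: int = 2) -> set[str]:
    """Set of character n-grams of `word`, padded with '#' at both ends."""
    padded = "#" * (n - 1) + word + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(lexicon: list[str], n: int = 2) -> dict[str, set[str]]:
    """Inverted index mapping each n-gram to the lexicon words containing it."""
    index = defaultdict(set)
    for word in lexicon:
        for gram in ngrams(word, n):
            index[gram].add(word)
    return index


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def approximate_lookup(query: str, index: dict[str, set[str]],
                       n: int = 2, max_dist: int = 1) -> list[tuple[int, str]]:
    """Collect words sharing at least one n-gram with `query`, then keep
    those within `max_dist` edits, sorted by distance and then by word."""
    candidates = set()
    for gram in ngrams(query, n):
        candidates |= index.get(gram, set())
    scored = [(edit_distance(query, word), word) for word in candidates]
    return sorted((d, w) for d, w in scored if d <= max_dist)


if __name__ == "__main__":
    index = build_index(["smith", "smyth", "smithe", "jones"])
    print(approximate_lookup("smith", index))
```

The n-gram step is a heuristic filter: a very short word can lie within the distance threshold yet share no n-gram with the query, in which case this sketch would miss it.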

Publication date: 2003